Approximating High-Dimensional Range Queries with kNN Indexing Techniques
Abstract
While k-nearest neighbor (kNN) queries are becoming increasingly common due to mobile and geospatial applications, orthogonal range queries in high-dimensional data are extremely important in scientific and web-based applications. For efficient querying, data is typically stored in an index optimized for either kNN or range queries. This can be problematic when the index is optimized for kNN retrieval and a user needs a range query, or vice versa. Here, we address the issue of using a kNN-based index for range queries, and outline the general computational geometry problem of adapting these systems to range queries. We refer to these methods as space-based decompositions and provide a straightforward heuristic for this problem. Using iDistance as our applied kNN indexing technique, we also develop an optimal (data-based) algorithm designed specifically for its indexing scheme. We compare this method to the suggested naive approach using real-world datasets, and the results show that our data-based algorithm consistently performs better.
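To make the problem setting concrete, here is a minimal sketch of one simple way to answer an orthogonal range query with a distance-based search: cover the query box with its enclosing ball, run a radius search, then filter out the false positives. This is an illustrative assumption, not the heuristic or the data-based algorithm from the paper. The function names are hypothetical, and `radius_search` is a linear-scan stand-in for a real kNN or distance index such as iDistance.

```python
import math

def enclosing_ball(lo, hi):
    """Center and radius of the smallest ball containing the box [lo, hi]."""
    center = [(a + b) / 2 for a, b in zip(lo, hi)]
    # Half the box diagonal, padded slightly so that corner points are not
    # lost to floating-point rounding.
    radius = math.dist(lo, hi) / 2 * (1 + 1e-9)
    return center, radius

def radius_search(points, center, radius):
    """Hypothetical stand-in for a distance-based index: a plain linear scan."""
    return [p for p in points if math.dist(p, center) <= radius]

def range_query_via_ball(points, lo, hi):
    """Answer the box query [lo, hi] using only a radius search plus a filter."""
    center, radius = enclosing_ball(lo, hi)
    candidates = radius_search(points, center, radius)
    # The ball is a superset of the box, so discard points outside the box.
    return [p for p in candidates
            if all(a <= x <= b for a, x, b in zip(lo, p, hi))]

points = [(0, 0), (1, 1), (2, 2), (3, 0.5), (1.5, 0.5), (1.5, 1.1)]
result = range_query_via_ball(points, lo=(1, 0), hi=(2, 1))
```

In this sketch the point `(1.5, 1.1)` falls inside the ball but outside the box, so it is retrieved as a candidate and then filtered out. The enclosing ball grows much larger than the box as dimensionality increases, so a covering of this kind tends to retrieve many such false positives, which is one reason an index-aware approach can be attractive.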
Similar resources
Extending High-Dimensional Indexing Techniques Pyramid and iMinMax(θ): Lessons Learned
Pyramid Technique and iMinMax(θ) are two popular high-dimensional indexing approaches that map points in a high-dimensional space to a single-dimensional index. In this work, we perform the first independent experimental evaluation of Pyramid Technique and iMinMax(θ), and discuss in detail promising extensions for testing k-Nearest Neighbor (kNN) and range queries. For datasets with skewed dist...
A Comprehensive Study of iDistance Partitioning Strategies for kNN Queries and High-Dimensional Data Indexing
Efficient database indexing and information retrieval tasks such as k-nearest neighbor (kNN) search remain difficult challenges in large-scale and high-dimensional data. In this work, we perform the first comprehensive analysis of different partitioning strategies for the state-of-the-art high-dimensional indexing technique iDistance. This work greatly extends the discussion of why certa...
Constructing an Effective and Secure Query Services with Rsap Data Perturbation in the Cloud
Nowadays the cloud is increasingly popular because users can host their data there and upload large volumes of it. Large databases are handed to database service providers, which maintain range query services over them. Some users hold sensitive private data, and in that situation they cannot move the data for hosting until we provide security, confidentiality,...
Minimizing the Number of Keypoint Matching Queries for Object Retrieval
To increase the efficiency of interest-point based object retrieval, researchers have put remarkable effort into improving the efficiency of kNN-based feature matching, aiming to match thousands of features against a database within fractions of a second. However, due to the high-dimensional nature of image features that reduces the effectiveness of index structures (curse of dimensio...
Distributed computation of the knn graph for large high-dimensional point sets
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for...